NOTICE: When government or other drawings, specifications or other data
are used for any purpose other than in connection with a definitely
related government procurement operation, the U. S. Government thereby
incurs no responsibility, nor any obligation whatsoever; and the fact
that the Government may have formulated, furnished, or in any way
supplied the said drawings, specifications, or other data is not to be
regarded by implication or otherwise as in any manner licensing the
holder or any other person or corporation, or conveying any rights or
permission to manufacture, use or sell any patented invention that may
in any way be related thereto.

SYSTEM TROUBLE-SHOOTING


As we all know, the concept of trouble-shooting has generally been used to
summarize the process of isolating malfunctions in electronic equipment. I
should like to explore its application to the diagnosis of performance of
large operational systems, or the prototypes of such systems, and their parts.
In this context, as with electronic equipment, the process is one of locating
components whose malperformance is preventing the desired system output. The
purpose is to determine remedial action. The method is to make measurements
at various locations.

However, system trouble-shooting is also obviously different. It occurs on
a grander scale. The system components may be subsystems; in command/control
or information systems these may be the sensors, the data processing and
utilization portions, or the effector parts of the system. Or one may
trouble-shoot such systems to locate faults as resident in the hardware, in
the computer programs, or in the human elements. The remedial action does not
consist of replacing and repairing; within the human factors area it may take
the form of training, of selection or manning, of procedural or organizational
changes, or of equipment or computer program redesign. The methods are more
complex than applying voltmeters to test points. Complex system and subsystem
tests are necessary; determinations must be made about the selection of
measures and the criteria against which to match them; and techniques must be
developed and used for relating these measures to each other.

The measurement considerations in system trouble-shooting appear to have
received relatively little investigation, and this is one reason why I have
selected this as a topic. I shall discuss briefly a number of possible
guidelines and draw on pertinent work which the System Development Corporation
and other agencies have undertaken in connection with SAGE and other systems.

1. Measures used for diagnosis may differ from those appropriate for
obtaining evaluations of system capability.

Evaluation and diagnosis as differing goals in field testing have been
examined by Meister and Searle, and the latter has explored the
difference with respect to the selection of performance measures. The
engineer primarily concerned with improving a system, Searle says, "wants
a system performance measure which can be used to determine the areas in
which improvement should yield the highest payoff, and to demonstrate
whether changes are effective," rather than some magnitude measure with
extremely high validity. As an example, he takes measures of CEP for a
bombing aircraft system based on distributions of error obtained through
practice "bombing" of domestic targets. Such measures lack sufficient
validity for evaluation of system capability, since many relevant factors
were not present in their derivation, but they can be used for diagnosis
of relative contributions of various subsystems to the total system error.
Certain assumptions must be made about the general nature of the missing
factors, the direction of their influence, and the representativeness of
the factors which are present in the practice "bombing." Searle states:
"Hence, let us cease worrying for the moment whether the obtained CEP of
1000 yards, say, is [illegible] or even 2000 yards less than the real one.
We shall use the obtained CEP to study which system subfunctions and/or
components are most important, and then see whether this information is
helpful in identifying the directions and nature of most needed further
developmental work."
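
Searle's point can be illustrated with a minimal sketch. It assumes, purely
for illustration, that subsystem errors are independent and additive, so that
their variances sum to the total; the subsystem names and figures are
hypothetical and do not come from his report. Under that assumption the
relative contributions do not change when every figure is understated by the
same unknown factor, which is why a CEP of doubtful absolute validity can
still serve for diagnosis.

```python
def error_shares(subsystem_sigmas):
    """Return each subsystem's fraction of total error variance.

    Assumes the subsystem errors are independent and additive, so that
    their variances sum to the total system error variance.
    """
    variances = {name: sigma ** 2 for name, sigma in subsystem_sigmas.items()}
    total = sum(variances.values())
    return {name: variance / total for name, variance in variances.items()}


# Hypothetical error standard deviations (yards) from practice runs.
practice = {"navigation": 600.0, "bombsight": 300.0, "release": 200.0}
shares = error_shares(practice)
ranked = sorted(shares, key=shares.get, reverse=True)

# If every figure is understated by the same unknown factor, the shares
# (and therefore the ranking) are unchanged.
doubled = error_shares({name: 2.0 * s for name, s in practice.items()})
```

The invariance holds only when the missing factors act on all subsystems
alike, which is exactly the kind of assumption that has to be stated.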

2. Component and subsystem measures become essential and must be related to
end measures.

Wolin has discussed some of the relationships between subsystem measures
and end-performance measures. He makes the point that Step II in a system
may operate with 100 per cent efficiency "in the sense that, for every
successful processing performed in Step I, Step II operates perfectly."
However, it is obvious that if Step I operates with only 50 per cent
efficiency, the system will operate at the same unsatisfactory level.

Now, as Wolin points out, making Step I 100 per cent efficient may by no
means bring the system to 100 per cent efficiency, because "the chances
are very great that compatibility between Steps I and II has been lost."

In such a case, Step II must be redesigned with Step I in mind.
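
The arithmetic behind Wolin's two-step example can be sketched as follows,
on the assumption that each step's efficiency is conditional on the output of
the step before it, so that efficiencies in series multiply. The figure of
0.7 for Step II after the change is hypothetical and stands only for some
loss of compatibility.

```python
from math import prod


def system_efficiency(step_efficiencies):
    """Efficiency of steps in series, each conditional on the step before."""
    return prod(step_efficiencies)


# Step I at 50 per cent, Step II perfect on whatever Step I passes along.
before = system_efficiency([0.5, 1.0])

# Step I raised to 100 per cent, but Step II now handles inputs it was not
# designed for; the 0.7 is a hypothetical figure for lost compatibility.
after = system_efficiency([1.0, 0.7])
```

Here the system improves from 0.5 to 0.7 but does not reach 1.0, even though
the step that was changed is now perfect.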

Another problem in relating subsystem measures to each other is the
selection of a common denominator. Use of the same kind of criterion
measure facilitates determining such relationships. For example, as we
all know, cost figures are often used by operations analysts for this
purpose. Probability estimates constitute another common denominator
as illustrated in a description by Jones of a model of effectiveness of
an Aircraft Carrier Attack Force in terms of the expected number of
targets destroyed in a given period of time. Of course, various kinds of
measures are assembled to generate the probability figures in the model
and submodels, as is also the case with a cost criterion, and the problems
of manipulating these are not always easily resolved.

Perhaps the most frequently used diagnostic procedure for system
trouble-shooting is the non-quantitative one of trying to trace back from
result to origin analytically. For example, in an air defense exercise one
might discover that several "targets" never were intercepted in Sector
A and that their tracks were never cross-told from Sector B. It may be
a reasonable deduction that proper cross-telling would have resulted in
their interception. However, such analyses do not provide information
as to the relative contribution of cross-telling, in comparison with
other system functions, to the total system malperformance. It is
rather rare that more quantitative, generalized relationships are
ascertained in complex systems, as, for example, kill probability as
a function of tracking "goodness" in SAGE. A study of SAGE end-measures,
subsystem measures and their inter-relationships has recently been
undertaken at SDC by J. F. Rowell, using data from training and tactical
evaluation missions. In addition, an experimental study is being conducted
to assess the effects of degradation in the surveillance subsystem on the
actions of the weapons team, where system measures and subsystem measures
are the same (e.g., "kills"). Weapons teams alternately receive inputs
actually processed by the surveillance subsystem operators in one session
and processed by pre-programmed "correct" surveillance actions (i.e.,
untouched by human surveillance hands) in another session.
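
A minimal sketch of how data from such alternating sessions might be
compared, assuming each weapons team contributes one kill count per
condition. The function name and the counts are hypothetical; nothing here
is taken from the study itself.

```python
from statistics import mean


def mean_degradation(kills_preprogrammed, kills_human):
    """Mean per-team drop in kills when surveillance is done by operators.

    Each position in the two lists is the same weapons team, so the
    comparison is paired: team differences are taken before averaging.
    """
    differences = [p - h for p, h in zip(kills_preprogrammed, kills_human)]
    return mean(differences)


# Hypothetical kill counts for four weapons teams.
kills_preprogrammed = [9, 8, 10, 7]  # inputs from "correct" switch actions
kills_human = [7, 8, 8, 6]           # inputs processed by surveillance crews
degradation = mean_degradation(kills_preprogrammed, kills_human)
```

Pairing by team keeps differences between teams from being mistaken for
effects of surveillance degradation.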

An even more difficult problem is to distinguish between the error
contributions from hardware, from computer programs and from personnel.

Here one must obtain performance data emanating exclusively from one or
two of these classes of potential sources. Shapero and Erickson have
the following to say in regard to isolating human-initiated malfunctions
in weapons systems:

"The primary inadequacy of the data as presently collected in regard to
human-initiated malfunctions is that it is difficult, if not impossible,
to refer the failure event to any model that describes the dynamic
interactions of the failed item with the human components of the system. It
is, consequently, difficult to identify those human components or
characteristics of the system that might be modified in order to prevent
recurrence of the failure. It is proposed here that this inadequacy can
be overcome by modifying present failure reporting forms and procedures
to require an explicit identification of the specific operation during
which a failure is recognized as such."
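
The proposal amounts to adding one required field to each failure report.
A minimal sketch follows, assuming a report needs only the failed item, the
operation during which the failure was recognized, and a flag for human
initiation. The record layout, field names and sample entries are all
hypothetical and are not the reporting forms the authors had in mind.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureReport:
    item: str              # the failed item
    operation: str         # operation during which the failure was recognized
    human_initiated: bool  # whether the malfunction was human-initiated


def human_failures_by_operation(reports):
    """Count human-initiated failures under each reported operation."""
    return Counter(r.operation for r in reports if r.human_initiated)


# Hypothetical reports.
reports = [
    FailureReport("umbilical connector", "checkout", True),
    FailureReport("gyro", "countdown", False),
    FailureReport("fuel valve", "countdown", True),
    FailureReport("fuel valve", "countdown", True),
]
by_operation = human_failures_by_operation(reports)
```

With the operation recorded, failures can be grouped by the task in which
people were involved rather than only by the part that failed.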

3. End measures may be of limited value in assessing the performance of
subsystems or elements functioning earlier in the data flow.

This is the case where one is testing an operational or prototype system
and cannot manipulate subsystem features experimentally. It should be
apparent that without such variation in independent variables, the end
measure will not point to the source of variance among the subsystems,
or among the hardware, program and human components. Even in experimental
situations it may be preferable to seek interim rather than end measures.

For example, in testing a ground-based computing and tracking system for
control of interceptor aircraft, one may prefer to obtain measures of
interceptor position at the point of presumed handover to the aircraft's
fire control system, as Parsons did in field testing and laboratory
testing of the AN/GPA-23. This procedure enables one to hold constant any
effects on total system performance coming from the last subsystem (when
total system is defined as including the autonomous functioning of the
interceptor).

On the other hand, a case for using end measures is made by Hitt and Ray5
in their report on the Battelle Memorial Institute laboratory studies of
effectiveness of electronic countermeasures. Although they manipulated
ECM parameters whose immediate impacts occur among surveillance operators,
their dependent variable was kill probability. They state:

"One of the most important conclusions to be derived from the present
research program is that it seems imperative that the evaluation of ECM
effectiveness be done within a systems framework. If this research
program had been designed to ascertain the effects of ECM on radar-operator
performance, several measures of operator performance would have been
available. In turn, the selected ECM displays could have been rank
ordered on a degree-of-effectiveness continuum, according to the mean
operator performance scores achieved under the various displays. Based
upon previous work at Battelle, it is certain that the results obtained
from such an approach would have been misleading. (Mean range errors
and mean azimuth errors made by the operator, for example, appear to be
unrelated to Pk.)"
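
The hazard described in this passage can be sketched by rank-ordering the
same set of ECM displays twice, once on an operator measure and once on kill
probability. The display labels and every figure below are hypothetical and
are not Battelle data; the sketch shows only how two orderings may disagree.

```python
def rank_order(scores, reverse=False):
    """Return the keys of `scores` ordered by their values."""
    return sorted(scores, key=scores.get, reverse=reverse)


# Hypothetical results under three ECM displays.
mean_range_error = {"display A": 1.2, "display B": 2.5, "display C": 0.8}
kill_probability = {"display A": 0.40, "display B": 0.55, "display C": 0.60}

# Most effective ECM first: largest operator error, or lowest Pk.
by_operator_error = rank_order(mean_range_error, reverse=True)
by_kill_probability = rank_order(kill_probability)
rankings_agree = by_operator_error == by_kill_probability
```

In this made-up case the display that most disturbs the operator is not the
one that most reduces kill probability, so the operator ranking alone would
mislead.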

In another report by Gordon, Hitt, Ray and Wetherbee, the Battelle
investigators found that surveillance-type criteria of ECM effectiveness
such as probability of establishing a track and blip/scan ratio were
highly related to kill probability (Pk). They conclude:

"Although definite correlations have been noted among the various criteria
which might be used to assess ECM effectiveness, it is apparent that they
are not equally good measures. Only a true systems measure such as
probability of kill can provide a meaningful measure of ECM effectiveness
against the defensive system it is intended to combat. The other measures
discussed can be used to obtain a comparison between two types of
countermeasures used against a given system, but they cannot be used to make
comparisons between systems. An absolute measure of effectiveness, such as
Pk, can be obtained from the other criteria only if the relationship
between the criteria is known for the particular system under study. In
the final analysis only a true system measure can establish an absolute,
meaningful, and generally applicable measure of ECM effectiveness."


4. It can be difficult to obtain useful measures of data utilization or
effector performance in information systems if the inputs from the sensors
or data processing portions are uncontrolled or have been degraded.

The purpose of this guideline for system trouble-shooting is to indicate
the value of trying to control, during a test exercise, the inputs into
the subsystem being tested and measured from another subsystem located
earlier in the data flow. For example, in SAGE the inputs into the
weapons subsystem will vary according to the processing by the
surveillance subsystem, and naturally the outputs of the weapons subsystem
will then also vary. In developing a weapons subsystem testing and training
program for Air Defense Command, described by Cockrell and Murphy, we at
SDC wanted to standardize the simulation inputs. The solution was to
pre-program the switch actions which the surveillance operators should have
taken and leave these operators entirely out of the exercise. Although
this technique was evolved for proficiency testing and training purposes,
it is applicable to system and subsystem diagnosis, as we have already
noted in the case of Rowell's surveillance degradation study.

There is another problem occasioned by "serial contamination." It is
possible that if certain types of inputs are introduced into the
surveillance portion of an information system in sufficient quantity, this
subsystem may not process enough data to provide useful inputs to the
subsequent parts of the system for either training or testing. The
performance measures for these other subsystems would be meaningless for
trouble-shooting within them.

5. Some measures have to be focussed explicitly on the interfaces between
subsystems.

The difficulty in attempting to derive system performance data by combining
subsystem data has been alluded to by Christensen, who commented that
"systems investigators experience understandable anguish when they attempt
to define those segments of the system that can be extracted and abstracted
for consideration without vitiation of over-all results upon reassembling
the entire system (the 'partitioning' problem)."

I suspect that behind this problem sometimes lies a neglect to obtain
interface measures. Such measures include those of communication between
subsystems, a class of measures emphasized by Kidd. These measures help
determine the fidelity with which the output from one subsystem was actually
input into the other subsystem(s). Without fidelity, assumptions of
equivalence between outputs from one and inputs into another may be made
improperly and may be responsible for the reassembly problem. More than
inter-person communications can be involved. In computer-based systems, the
process of digitizing data for transmission from a sensor into a computer
may bring about inequivalence (see Parsons).
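
An interface measure of this kind can be sketched as a comparison between
what one subsystem put out and what the next actually received. The example
assumes the only source of inequivalence is rounding to a fixed digitizing
step; the step size and the range values are hypothetical.

```python
def digitize(value, step):
    """Round a value to the nearest multiple of the digitizing step."""
    return round(value / step) * step


def interface_errors(outputs, inputs):
    """Absolute difference between each output and the matching input."""
    return [abs(o - i) for o, i in zip(outputs, inputs)]


# Hypothetical sensor ranges (miles) and a hypothetical 0.5-mile step.
sensor_ranges = [10.2, 47.9, 83.4]
received = [digitize(r, step=0.5) for r in sensor_ranges]
errors = interface_errors(sensor_ranges, received)
worst = max(errors)
```

If the measured differences are not negligible, the output of the first
subsystem cannot simply be assumed to be the input of the second when
subsystem results are reassembled.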

6. Inputs can profitably include "stressor" events, and in military systems
consideration should be given to introducing the effects of hostile
action.

How does one obtain measures of how a system or its subsystems react to
rare events? One technique is to force the event and increase its
frequency. This has been done for SAGE surveillance training, as
described by Okanes11 and Arnold1, by introducing simulated inputs which
make the computer "track off" and consequently require manual
intervention. To design such inputs, one must test the automatic tracking
system to find out the circumstances under which the hardware and the
computer program cannot maintain the track. It should be noted that
this is also a technique for trouble-shooting the system to determine
to which kinds of components (hardware, program or human) malfunctions
should be attributed. Naturally, one must be certain that the measurement
data obtained with this forcing technique are produced by identifiable
stressors. Special programs linking inputs, data reduction and
evaluation have been developed for this purpose (see Newlands, Hibler,
Hanson, Irons, Katter and Levine).
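
A minimal sketch of the forcing technique, assuming that each simulated
track can be tagged with the stressor built into it so that later
measurements can be traced back to an identifiable cause. The function
names, the tag and the proportions are hypothetical and do not describe the
SAGE programs cited above.

```python
import random


def build_inputs(n_tracks, forced_fraction, seed=0):
    """Build a simulated input list with a tagged subset of forced events."""
    rng = random.Random(seed)
    n_forced = round(n_tracks * forced_fraction)
    forced_ids = set(rng.sample(range(n_tracks), n_forced))
    return [
        {"track_id": i, "stressor": "forced_track_off" if i in forced_ids else None}
        for i in range(n_tracks)
    ]


def attribute_interventions(inputs, intervened_ids):
    """Split manual interventions into those on forced tracks and the rest."""
    forced = {t["track_id"] for t in inputs if t["stressor"] is not None}
    intervened = set(intervened_ids)
    return {
        "on_forced_tracks": len(intervened & forced),
        "unexplained": len(intervened - forced),
    }


# A quarter of 20 hypothetical tracks are designed to make tracking fail.
inputs = build_inputs(n_tracks=20, forced_fraction=0.25)
```

Interventions counted as "unexplained" are the ones that still need
diagnosis, since no designed stressor accounts for them.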

As a final proposed guideline, let me add a favorite point of emphasis.

It is sometimes preferred to measure system and subsystem performance at
first under unrealistically simple conditions. In modern military systems
this may mean omission of nuclear effects and electronic countermeasures.
Not only can this approach lead to delusions of system grandeur and a
protracted avoidance of realistic evaluation, but it may forestall just
the kind of system trouble-shooting which is needed the most (and
therefore the soonest). Linville has stressed that evaluation "would
certainly have to involve ... a range of enemy attack tactics and
countermeasures as well as a range of defense weapons." In military systems,
measurement of system performance must be based on the effects of what
J. Mencher has called the "anti-system." One might suggest that the
system and its anti-system constitute the total system to which
measurement for trouble-shooting purposes should be applied.

Unclassified report

DESCRIPTORS: Command and Control Systems.

Explores application of the concept of trouble-shooting as used in the
process of isolating malfunctions to the diagnosis of the performance of
large operational systems. Reports that the purpose is to determine
remedial action, and that the method used is to make measurements at
various locations. Sets forth guidelines. Concludes that in military
systems it is unrealistic to measure system and subsystem performance
under simple conditions (such as the omission of nuclear effects and
electronic countermeasures).